Results 1 - 20 of 313
1.
Rev. neurol. (Ed. impr.) ; 78(7): 209-211, Ene-Jun, 2024.
Article in Spanish | IBECS | ID: ibc-232183

ABSTRACT

Leading scientific journals in fields such as medicine, biology and sociology repeatedly publish articles and editorials claiming that a large percentage of doctors do not understand the basics of statistical analysis, which increases the risk of errors in interpreting data, makes them more vulnerable to misinformation and reduces the effectiveness of research. This problem extends throughout their careers and is largely due to the poor training they receive in statistics, a problem that is common in developed countries. As stated by H. Haller and S. Krauss, '90% of German university lecturers who regularly use the p-value in tests do not understand what that value actually measures'. It is important to note that the basic reasoning of statistical analysis is similar to what we do in our daily lives, and that understanding the basic concepts of statistical analysis does not require any knowledge of mathematics. Contrary to what many researchers believe, the p-value of a test is not a 'mathematical index' that allows us to conclude clearly whether, for example, a drug is more effective than a placebo. The p-value of a test is simply a percentage. (AU)
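The closing claim above, that a p-value is simply a percentage, can be made concrete with a small permutation test. All numbers here are invented for illustration; they are not from the article.

```python
import random

random.seed(0)

# Hypothetical outcome scores (illustrative only; not from the article).
drug = [5.1, 6.0, 5.8, 6.3, 5.5, 6.1]
placebo = [4.8, 5.2, 4.9, 5.4, 5.0, 5.3]

observed = sum(drug) / len(drug) - sum(placebo) / len(placebo)

# Permutation test: the p-value is just the percentage of random
# relabelings that produce a difference at least as large as observed.
pooled = drug + placebo
n_perm = 10_000
count = 0
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = sum(pooled[:6]) / 6 - sum(pooled[6:]) / 6
    if diff >= observed:
        count += 1

p_value = count / n_perm  # a proportion, i.e. a percentage
print(f"p = {p_value:.4f}")
```

The p-value is literally the fraction of shuffles reaching the observed difference, nothing more.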


Subjects
Humans, Male, Female, Biomedical Research, Periodicals, Scientific and Technical Publications, Hypothesis Testing, Predictive Value of Tests
2.
Proc Natl Acad Sci U S A ; 121(15): e2322083121, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38568975

ABSTRACT

While reliable data-driven decision-making hinges on high-quality labeled data, the acquisition of quality labels often involves laborious human annotations or slow and expensive scientific measurements. Machine learning is becoming an appealing alternative as sophisticated predictive techniques are being used to quickly and cheaply produce large amounts of predicted labels; e.g., predicted protein structures are used to supplement experimentally derived structures, predictions of socioeconomic indicators from satellite imagery are used to supplement accurate survey data, and so on. Since predictions are imperfect and potentially biased, this practice brings into question the validity of downstream inferences. We introduce cross-prediction: a method for valid inference powered by machine learning. With a small labeled dataset and a large unlabeled dataset, cross-prediction imputes the missing labels via machine learning and applies a form of debiasing to remedy the prediction inaccuracies. The resulting inferences achieve the desired error probability and are more powerful than those that only leverage the labeled data. Closely related is the recent proposal of prediction-powered inference [A. N. Angelopoulos, S. Bates, C. Fannjiang, M. I. Jordan, T. Zrnic, Science 382, 669-674 (2023)], which assumes that a good pretrained model is already available. We show that cross-prediction is consistently more powerful than an adaptation of prediction-powered inference in which a fraction of the labeled data is split off and used to train the model. Finally, we observe that cross-prediction gives more stable conclusions than its competitors; its CIs typically have significantly lower variability.
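A minimal sketch of the cross-prediction idea for estimating a mean E[Y]: impute labels with models fit by cross-fitting, then debias with held-out residuals. The data, the linear model and the fold count are invented, and the paper's confidence intervals are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup (illustrative): Y depends linearly on X plus noise.
n_lab, n_unlab = 100, 10_000
X_lab = rng.normal(size=n_lab)
Y_lab = 2.0 * X_lab + 1.0 + rng.normal(size=n_lab)
X_unlab = rng.normal(size=n_unlab)

K = 5
folds = np.array_split(rng.permutation(n_lab), K)
residual_means, imputed_means = [], []
for fold in folds:
    mask = np.ones(n_lab, dtype=bool)
    mask[fold] = False
    # Fit a simple least-squares model on the other folds...
    a, b = np.polyfit(X_lab[mask], Y_lab[mask], 1)
    # ...impute labels for the unlabeled data...
    imputed_means.append(np.mean(a * X_unlab + b))
    # ...and debias with residuals on the held-out labeled fold.
    residual_means.append(np.mean(Y_lab[fold] - (a * X_lab[fold] + b)))

theta_hat = np.mean(imputed_means) + np.mean(residual_means)
print(f"cross-prediction estimate of E[Y]: {theta_hat:.3f}")  # true value is 1.0
```

The debiasing term corrects the imputation error even when the fitted model is biased, which is the point of the method.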

3.
Comput Biol Med ; 173: 108349, 2024 May.
Article in English | MEDLINE | ID: mdl-38547660

ABSTRACT

BACKGROUND: Ventilator dyssynchrony (VD) can worsen lung injury and is challenging to detect and quantify due to the complex variability of dyssynchronous breaths. While machine learning (ML) approaches are useful for automating VD detection from ventilator waveform data, scalable severity quantification and its association with pathogenesis and ventilator mechanics remain challenging. OBJECTIVE: We develop a systematic framework to quantify pathophysiological features observed in ventilator waveform signals such that they can be used to create feature-based severity stratification of VD breaths. METHODS: A mathematical model was developed to represent the pressure and volume waveforms of individual breaths in a feature-based parametric form. Model estimates of respiratory effort strength were used to assess the severity of flow-limited (FL)-VD breaths compared to normal breaths. A total of 93,007 breath waveforms from 13 patients were analyzed. RESULTS: A novel model-defined continuous severity marker was developed and used to estimate breath phenotypes of FL-VD breaths. The phenotypes had a predictive accuracy of over 97% with respect to the previously developed ML-VD identification algorithm. To understand the incidence of FL-VD breaths and their association with patient state, these phenotypes were further successfully correlated with ventilator-measured parameters and electronic health records. CONCLUSION: This work provides a computational pipeline to identify and quantify the severity of FL-VD breaths and paves the way for a large-scale study of the causes and effects of VD. The approach has direct application to clinical practice and to meaningful knowledge extraction from ventilator waveform data.


Subjects
Lung Injury, Humans, Mechanical Ventilators, Lung/physiology, Artificial Respiration/methods
4.
Prog Transplant ; : 15269248241237823, 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38449093
5.
Philos Trans A Math Phys Eng Sci ; 382(2270): 20230140, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38403052

ABSTRACT

The collective statistics of voting on judicial courts present hints about their inner workings. Many approaches for studying these statistics, however, assume that judges' decisions are conditionally independent: a judge reaches a decision based on the case at hand and his or her personal views. In reality, judges interact. We develop a minimal model that accounts for judge bias, depending on the context of the case, and peer interaction. We apply the model to voting data from the US Supreme Court. We find strong evidence that interaction is an important factor across natural courts from 1946 to 2021. We also find that, after accounting for interaction, the recovered biases differ from highly cited ideological scores. Our method exemplifies how physics and complexity-inspired modelling can drive the development of theoretical models and improved measures for political voting. This article is part of the theme issue 'A complexity science approach to law and governance'.
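The effect the model is designed to capture, namely that peer interaction inflates the rate of unanimous decisions beyond what conditionally independent, context-driven voting predicts, can be illustrated with a toy simulation. The model form, interaction strength and all numbers below are invented, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(6)
n_judges, n_cases = 9, 5000

def simulate(J):
    # Toy vote model: vote = sign(case context + peer field + idiosyncratic noise).
    # J is the peer-interaction strength; J = 0 gives conditionally independent votes.
    unanimous = 0
    for _ in range(n_cases):
        context = rng.normal()                                # case-specific pull
        votes = np.sign(context + rng.normal(size=n_judges))  # first-pass opinions
        field = J * votes.mean()                              # felt majority pressure
        votes = np.sign(context + field + rng.normal(size=n_judges))
        unanimous += abs(votes.sum()) == n_judges
    return unanimous / n_cases

p_independent = simulate(0.0)
p_interacting = simulate(2.0)
print(f"unanimous share, independent judges: {p_independent:.2f}")
print(f"unanimous share, interacting judges: {p_interacting:.2f}")
```

Comparing the two shares shows why unanimity statistics carry evidence about interaction strength.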

8.
Rev. neurol. (Ed. impr.) ; 78(1)1 - 15 de Enero 2024. tab
Article in Spanish | IBECS | ID: ibc-229062

ABSTRACT

A very common practice in medical research, during data analysis, is to dichotomise numerical variables into two groups. This practice discards very useful information and can undermine the effectiveness of the research. Several examples show how the dichotomisation of numerical variables leads to a loss of statistical power. This can be critical when assessing, for example, whether a therapeutic procedure is more effective or whether a certain factor is a risk factor. Dichotomising continuous variables is therefore not recommended unless there is a very specific reason to do so. (AU)
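A small simulation along the lines of the argument above: with a continuous predictor and a modest true effect, testing the continuous association rejects more often than testing after a median split. The data, effect size and critical values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def reject_continuous(x, y):
    # Test the association via the t-statistic of the sample correlation.
    r = np.corrcoef(x, y)[0, 1]
    n = len(x)
    t = r * np.sqrt((n - 2) / (1 - r ** 2))
    return abs(t) > 1.98  # ~5% two-sided critical value for n near 100

def reject_dichotomized(x, y):
    # Median-split x into "low"/"high" groups, then a two-sample t-test.
    hi = x > np.median(x)
    y1, y0 = y[hi], y[~hi]
    se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    return abs(y1.mean() - y0.mean()) / se > 1.98

n_sims, n = 2000, 100
power_cont = power_dich = 0
for _ in range(n_sims):
    x = rng.normal(size=n)
    y = 0.3 * x + rng.normal(size=n)   # modest true effect
    power_cont += reject_continuous(x, y)
    power_dich += reject_dichotomized(x, y)

print(f"power (continuous predictor): {power_cont / n_sims:.2f}")
print(f"power (dichotomized):         {power_dich / n_sims:.2f}")
```

The gap between the two rejection rates is the power lost by throwing away the within-group variation.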


Subjects
Biomedical Research/statistics & numerical data, Statistical Models
9.
Stat Med ; 43(6): 1103-1118, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38183296

ABSTRACT

Regression modeling is the workhorse of statistics and there is a vast literature on the estimation of the regression function. It has been realized in recent years that in regression analysis the ultimate aim may be the estimation of a level set of the regression function, i.e., the set of covariate values for which the regression function exceeds a predefined level, instead of the estimation of the regression function itself. The published work on estimation of the level set has thus far focused mainly on nonparametric regression, especially on point estimation. In this article, the construction of confidence sets for the level set of linear regression is considered. In particular, 1 - α level upper, lower and two-sided confidence sets are constructed for normal-error linear regression. It is shown that these confidence sets can easily be constructed from the corresponding 1 - α level simultaneous confidence bands. It is also pointed out that the construction method is readily applicable to other parametric regression models where the mean response depends on a linear predictor through a monotonic link function, including generalized linear models, linear mixed models and generalized linear mixed models. The method proposed in this article is therefore widely applicable. Simulation studies with both linear and generalized linear models are conducted to assess the method, and real examples are used to illustrate it.
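The key construction, reading level-set confidence sets off a simultaneous confidence band, can be sketched as follows. The data, the band's critical constant and the "inner"/"outer" set names are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic regression data (illustrative): mean(x) = 1 + 0.5 x.
n = 200
x = np.linspace(0, 10, n)
y = 1.0 + 0.5 * x + rng.normal(scale=1.0, size=n)

# Least-squares fit and standard errors of the fitted mean at each x.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - 2)
XtX_inv = np.linalg.inv(X.T @ X)
se_fit = np.sqrt(sigma2 * np.sum(X @ XtX_inv * X, axis=1))

# A simultaneous band needs a larger critical constant than the pointwise
# 1.96; the Scheffe-type value sqrt(2 F) is used as a conservative stand-in.
crit = np.sqrt(2 * 3.0)
lower = X @ beta - crit * se_fit
upper = X @ beta + crit * se_fit

# Confidence sets for the level set {x : mean(x) > c}:
c = 3.0
inner_set = x[lower > c]   # contained in the true level set (with high confidence)
outer_set = x[upper > c]   # contains the true level set (with high confidence)
print(f"inner set starts at x = {inner_set.min():.2f}")
print(f"outer set starts at x = {outer_set.min():.2f}")
```

Because lower <= upper everywhere, the inner set is always nested inside the outer set, sandwiching the true level set (here {x : x > 4}).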


Assuntos
Modelos Estatísticos , Humanos , Modelos Lineares , Análise de Regressão , Simulação por Computador
10.
Vox Sang ; 119(1): 34-42, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38018286

ABSTRACT

BACKGROUND AND OBJECTIVES: Although the genetic determinants of haemoglobin (Hb) and ferritin have been widely studied, those of the clinically and globally relevant iron deficiency anaemia (IDA) and deferral due to hypohaemoglobinaemia (Hb-deferral) are unclear. In this investigation, we aimed to quantify the value of genetic information in predicting IDA and Hb-deferral. MATERIALS AND METHODS: We analysed genetic data from up to 665,460 participants of FinnGen, the Blood Service Biobank and the UK Biobank, and used INTERVAL (N = 39,979) for validation. We performed genome-wide association studies (GWASs) of IDA and Hb-deferral and utilized publicly available genetic associations to compute polygenic scores for IDA, ferritin and Hb. We fitted models to estimate the effect sizes of these polygenic risk scores (PRSs) on IDA and Hb-deferral risk while accounting for each individual's age, sex, weight, height, smoking status and blood donation history. RESULTS: Significant variants in the GWASs of IDA and Hb-deferral appear to be a small subset of the variants associated with ferritin and Hb. The effect sizes of genetic predictors of IDA and Hb-deferral are similar to those of age and weight, which are typically used in blood donor management. A total genetic score for Hb-deferral was estimated for each individual. The estimated odds ratio between the first and ninth deciles of the total genetic score distribution ranged from 1.4 to 2.2. CONCLUSION: The value of genetic data in predicting IDA or suitability to donate blood appears to be at a practically useful level.
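As background, a polygenic score is an effect-size-weighted allele count. This is a hedged sketch with invented GWAS weights and genotypes; the study's actual scores, covariate models and cohorts are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical GWAS summary statistics: per-variant effect sizes (betas).
n_variants, n_people = 500, 1000
betas = rng.normal(scale=0.05, size=n_variants)

# Genotypes coded as 0/1/2 copies of the effect allele.
genotypes = rng.integers(0, 3, size=(n_people, n_variants))

# A polygenic score is simply the beta-weighted allele count per person.
prs = genotypes @ betas

# Standardize, then bin into deciles, as in the study's first-vs-ninth
# decile comparison (the binning here is purely illustrative).
prs_z = (prs - prs.mean()) / prs.std()
deciles = np.digitize(prs_z, np.quantile(prs_z, np.arange(0.1, 1.0, 0.1)))
print(f"people in top decile: {(deciles == 9).sum()}")
```

Risk models then compare outcome rates across these decile bins alongside age, weight and other donor-management covariates.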


Subjects
Iron-Deficiency Anemia, Humans, Iron-Deficiency Anemia/genetics, Genome-Wide Association Study, Ferritins/genetics, Hemoglobins/analysis
11.
Rev. saúde pública (Online) ; 58: 01, 2024. graf
Article in English | LILACS | ID: biblio-1536768

ABSTRACT

OBJECTIVE: This study aims to propose a comprehensive alternative to the Bland-Altman plot method, addressing its limitations and providing a statistical framework for evaluating the equivalence of measurement techniques. This involves introducing an innovative three-step approach for assessing accuracy, precision and agreement between techniques, which enhances objectivity in equivalence assessment, as well as developing an easy-to-use R package that enables researchers to efficiently analyse and interpret technique equivalence. METHODS: Inferential statistical support for equivalence between measurement techniques was proposed as three nested tests. These were based on structural regressions, with the goal of assessing the equivalence of structural means (accuracy), the equivalence of structural variances (precision) and concordance with the structural bisector line (agreement in measurements obtained from the same subject), using analytical methods and a robust bootstrap approach. To promote better understanding, graphical outputs following Bland and Altman's principles were also implemented. RESULTS: The performance of the method was demonstrated on five data sets from previously published articles that used Bland and Altman's method. One case showed strict equivalence, three cases showed partial equivalence, and one showed poor equivalence. The developed R package, containing open code and data, is freely available with installation instructions at Harvard Dataverse at https://doi.org/10.7910/DVN/AGJPZH. CONCLUSION: Although easy to communicate, the widely cited and applied Bland and Altman plot method is often misinterpreted, since it lacks suitable inferential statistical support. Common alternatives, such as Pearson's correlation or ordinary least-squares linear regression, also fail to locate the weakness of each measurement technique. It is possible to test whether two techniques are fully equivalent while preserving graphical communication in accordance with Bland and Altman's principles, by adding robust and suitable inferential statistics. Decomposing equivalence into three features (accuracy, precision and agreement) helps to locate the sources of the problem when adjusting a new technique.
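For orientation, these are the descriptive Bland-Altman quantities that a three-step framework of this kind builds on, plus one naive inferential step (a paired t-statistic for zero bias, touching accuracy only). This is a sketch on invented data, not the authors' structural-regression R package.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical paired measurements from two techniques on the same subjects.
true_val = rng.uniform(10, 20, size=60)
method_a = true_val + rng.normal(scale=0.5, size=60)
method_b = true_val + 0.8 + rng.normal(scale=0.5, size=60)  # fixed bias of 0.8

diffs = method_a - method_b
means = (method_a + method_b) / 2   # x-axis of the Bland-Altman plot

bias = diffs.mean()
sd = diffs.std(ddof=1)
loa_low, loa_high = bias - 1.96 * sd, bias + 1.96 * sd
print(f"bias = {bias:.2f}, 95% limits of agreement = [{loa_low:.2f}, {loa_high:.2f}]")

# The article's point: the plot alone is descriptive. A minimal inferential
# step is a paired t-statistic for zero mean difference (accuracy only; the
# full method also tests precision and agreement via structural regressions).
t = bias / (sd / np.sqrt(len(diffs)))
print(f"t-statistic for zero bias: {t:.2f}")
```

Here the built-in bias of 0.8 is clearly detected; the descriptive limits of agreement alone would leave that judgment to the reader's eye.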


Subjects
Confidence Intervals, Regression Analysis, Statistical Data Interpretation, Statistical Inference, Data Reliability
12.
J Environ Radioact ; 272: 107358, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38142518

ABSTRACT

Radioactivity detection is a major research and development priority for many practical applications. Among the various technical challenges in this field is the need to carry out accurate low-level radioactivity measurements in the presence of large fluctuations in the natural radiation background, while keeping false alarm rates low. The task becomes even harder with high detection limits under low signal-to-background ratios. Detection methods based on statistical inference, following either a frequentist or a Bayesian paradigm, have been adopted to overcome these challenges and to ensure reliable and accurate diagnosis with a competitive tradeoff between sensitivity, specificity and response time. In this respect, several studies addressing a range of applications, from decommissioning and dismantling to homeland security, have been proposed. Our main goal in this paper is to present a succinct survey of these studies, covering the frequentist and Bayesian approaches used for decision-making and for uncertainty and risk evaluation in the context of radioactivity detection. A theoretical background of frequentist and Bayesian statistical inference is first presented. A comparative study of both approaches is then performed to determine which is preferable in terms of accuracy, along with their respective pros and cons. A case study of low-level radioactivity detection in nuclear decommissioning operations is provided to validate the choice. The results show the efficiency and usefulness of the Bayesian approach over the frequentist one in the most challenging scenarios in radiation detection applications.
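A toy version of the comparison on invented Poisson counts: a frequentist p-value against a known background versus a Bayesian posterior probability of a source under a 50/50 prior and an assumed source strength. All numbers are hypothetical.

```python
import math

def pois_pmf(k, lam):
    # P(N = k) for N ~ Poisson(lam).
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_sf(k, lam):
    # P(N >= k) for N ~ Poisson(lam), by summing the complement.
    return 1.0 - sum(pois_pmf(i, lam) for i in range(k))

background = 4.0   # expected background counts (assumed known)
observed = 9       # gross counts in the measurement window

# Frequentist: p-value of observing >= 9 counts under background alone.
p_value = poisson_sf(observed, background)

# Bayesian: posterior probability of "source present" vs "background only",
# with a 50/50 prior and an assumed source strength of 5 counts.
source = 5.0
like_h1 = pois_pmf(observed, background + source)
like_h0 = pois_pmf(observed, background)
posterior_h1 = like_h1 / (like_h1 + like_h0)

print(f"p-value under background-only: {p_value:.4f}")
print(f"posterior P(source present):   {posterior_h1:.4f}")
```

The Bayesian answer is a direct probability of the hypothesis of interest, which is why it is often easier to act on in alarm-threshold settings, at the cost of committing to a prior and a source model.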


Subjects
Radiation Monitoring, Radioactivity, Bayes Theorem, Uncertainty
13.
bioRxiv ; 2023 Nov 21.
Article in English | MEDLINE | ID: mdl-38045416

ABSTRACT

Typical statistical practices in the biological sciences have been increasingly called into question due to difficulties in replicating a growing number of studies, much of which stems from the relative difficulty of designing null hypothesis significance tests and interpreting p-values. Bayesian inference, representing a fundamentally different approach to hypothesis testing, is receiving renewed interest as a potential alternative or complement to traditional null hypothesis significance testing due to its ease of interpretation and explicit declaration of prior assumptions. Bayesian models are more mathematically complex than equivalent frequentist approaches, which has historically limited their application to simplified analysis cases. However, the advent of probability distribution sampling tools, together with exponential increases in computational power, now allows for quick and robust inference under any distribution of data. Here we present a practical tutorial on the use of Bayesian inference in the context of neuroscientific studies. We start with an intuitive discussion of Bayes' rule and inference, followed by the formulation of Bayesian regression and ANOVA models using data from a variety of neuroscientific studies. We show how Bayesian inference leads to easily interpretable analyses of data while providing an open-source toolbox to facilitate the use of Bayesian tools.
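The core of the tutorial's approach, Bayes' rule applied to a simple model, can be sketched with a grid approximation for a normal mean. The data, noise level and grid are invented; the tutorial's toolbox uses sampling methods instead.

```python
import numpy as np

# Observed data (illustrative): effect measurements from six trials.
data = np.array([1.2, 0.8, 1.5, 0.9, 1.1, 1.3])
sigma = 0.5  # assumed known measurement noise

# Grid over candidate means, with a flat prior.
mu_grid = np.linspace(-1, 3, 1001)
prior = np.ones_like(mu_grid)

# Log-likelihood of the data at each candidate mean (normal model).
log_like = np.array([
    -0.5 * np.sum((data - mu) ** 2) / sigma ** 2 for mu in mu_grid
])

# Bayes' rule: posterior is proportional to prior times likelihood.
post = prior * np.exp(log_like - log_like.max())
post /= post.sum()

post_mean = np.sum(mu_grid * post)
# 95% credible interval from the posterior quantiles.
cdf = np.cumsum(post)
lo = mu_grid[np.searchsorted(cdf, 0.025)]
hi = mu_grid[np.searchsorted(cdf, 0.975)]
print(f"posterior mean = {post_mean:.2f}, 95% CrI = [{lo:.2f}, {hi:.2f}]")
```

The credible interval reads directly as "the mean lies in this range with 95% probability, given the model and prior", the interpretability advantage the abstract highlights.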

14.
Ann Appl Stat ; 17(4): 3550-3569, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38106966

ABSTRACT

The Scientific Registry of Transplant Recipients (SRTR) system has become a rich resource for understanding the complex mechanisms of graft failure after kidney transplant, a crucial step for allocating organs effectively and implementing appropriate care. As the transplant centers that treated patients might strongly confound graft failure, Cox models stratified by center can eliminate their confounding effects. Also, since recipient age is a proven non-modifiable risk factor, a common practice is to fit models separately by recipient age group. The moderate sample sizes, relative to the number of covariates, in some age groups may lead to biased maximum stratified partial likelihood estimates and unreliable confidence intervals, even when samples still outnumber covariates. To draw reliable inference on a comprehensive list of risk factors measured from both donors and recipients in SRTR, we propose a de-biased lasso approach via quadratic programming for fitting stratified Cox models. We establish asymptotic properties and verify via simulations that our method produces consistent estimates and confidence intervals with nominal coverage probabilities. Accounting for nearly 100 confounders in SRTR, the de-biased method detects that the graft failure hazard increases nonlinearly with donor age in all recipient age groups, and that organs from older donors more adversely impact younger recipients. Our method also delineates the associations between graft failure and many risk factors, such as recipients' primary diagnoses (e.g., polycystic disease, glomerular disease and diabetes) and donor-recipient mismatches at human leukocyte antigen loci, across recipient age groups. These results may inform the refinement of donor-recipient matching criteria for stakeholders.

15.
Virus Evol ; 9(2): vead068, 2023.
Article in English | MEDLINE | ID: mdl-38107333

ABSTRACT

The Hepatitis C virus (HCV) envelope glycoprotein E1 forms a non-covalent heterodimer with E2, the main target of neutralizing antibodies. How E1-E2 interactions influence viral fitness and contribute to resistance to E2-specific antibodies remains largely unknown. We investigate this problem using a combination of fitness landscape and evolutionary modeling. Our analysis indicates that E1 and E2 proteins collectively mediate viral fitness and suggests that fitness-compensating E1 mutations may accelerate escape from E2-targeting antibodies. Our analysis also identifies a set of E2-specific human monoclonal antibodies that are predicted to be especially resilient to escape via genetic variation in both E1 and E2, providing directions for robust HCV vaccine development.

16.
J Am Stat Assoc ; 118(543): 1645-1658, 2023.
Article in English | MEDLINE | ID: mdl-37982008

ABSTRACT

In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response; in other words, to gauge the variable importance of features. Most recent work on variable importance assessment has focused on describing the importance of features within the confines of a given prediction algorithm. However, such assessment does not necessarily characterize the prediction potential of features, and may provide a misleading reflection of the intrinsic value of these features. To address this limitation, we propose a general framework for nonparametric inference on interpretable, algorithm-agnostic variable importance. We define variable importance as a population-level contrast between the oracle predictiveness of all available features versus all features except those under consideration. We propose a nonparametric efficient estimation procedure that allows the construction of valid confidence intervals, even when machine learning techniques are used. We also outline a valid strategy for testing the null importance hypothesis. Through simulations, we show that our proposal has good operating characteristics, and we illustrate its use with data from a study of an antibody against HIV-1 infection.
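A naive plug-in version of the proposed contrast, held-out predictiveness of all features minus that of all features except the one under consideration, on synthetic data. This ignores the paper's efficiency corrections and confidence intervals.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic data: feature 0 matters a lot, feature 1 barely, feature 2 not at all.
n = 2000
X = rng.normal(size=(n, 3))
y = 1.5 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(size=n)

train, test = slice(0, 1000), slice(1000, 2000)

def r2_with(cols):
    # Held-out R^2 of a least-squares fit using the given feature columns.
    A = np.column_stack([np.ones(1000), X[train][:, cols]])
    beta, *_ = np.linalg.lstsq(A, y[train], rcond=None)
    A_test = np.column_stack([np.ones(1000), X[test][:, cols]])
    resid = y[test] - A_test @ beta
    return 1 - resid @ resid / np.sum((y[test] - y[test].mean()) ** 2)

full = r2_with([0, 1, 2])
importances = []
for j in range(3):
    others = [k for k in range(3) if k != j]
    imp = full - r2_with(others)   # drop in predictiveness without feature j
    importances.append(imp)
    print(f"importance of feature {j}: {imp:.3f}")
```

Because the contrast is defined at the population level rather than inside one fitted model, any sufficiently flexible learner could replace the linear fit here.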

17.
Stud Health Technol Inform ; 308: 583-589, 2023 Nov 23.
Article in English | MEDLINE | ID: mdl-38007787

ABSTRACT

This study analyses the effects of potential demographic and economic factors on the COVID-19 death rate. The factors are examined through the extensive application of statistical inference, namely analysis of variance (ANOVA), univariate linear regression and multivariate linear regression. The results section discusses the significance of the effects of these factors on the COVID-19 death rate.


Subjects
COVID-19, Humans, Socioeconomic Factors, Multivariate Analysis, Linear Models
18.
Entropy (Basel) ; 25(10)2023 Sep 28.
Article in English | MEDLINE | ID: mdl-37895515

ABSTRACT

We study the excess minimum risk in statistical inference, defined as the difference between the minimum expected loss when estimating a random variable from an observed feature vector and the minimum expected loss when estimating the same random variable from a transformation (statistic) of the feature vector. After characterizing lossless transformations, i.e., transformations for which the excess risk is zero for all loss functions, we construct a partitioning test statistic for the hypothesis that a given transformation is lossless, and we show that for i.i.d. data the test is strongly consistent. More generally, we develop information-theoretic upper bounds on the excess risk that uniformly hold over fairly general classes of loss functions. Based on these bounds, we introduce the notion of a δ-lossless transformation and give sufficient conditions for a given transformation to be universally δ-lossless. Applications to classification, nonparametric regression, portfolio strategies, information bottlenecks, and deep learning are also surveyed.

19.
Rev. neurol. (Ed. impr.) ; 77(7)1 - 15 de Octubre 2023. tab
Article in Spanish | IBECS | ID: ibc-226080

ABSTRACT

When researchers request funding and authorisation from funding bodies to carry out a project, one of the first questions they are asked is: what is the statistical power of the proposed study? If the researcher answers, for example, 90%, and the evaluator is satisfied, the evaluator is certainly not really familiar with the subject. The power of a study is not unique: it depends on certain parameters, and in most cases slightly varying the values of those parameters changes the power to an acceptable value. If this is not the case and the study is carried out anyway, and its results are highly significant, there is no ground to question that success by arguing that the study was underpowered. It is simply time to celebrate. (AU)
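The point that the power of a study is not unique can be shown directly: under a normal approximation, the power of a two-sample test moves substantially with small changes in the assumed effect size. The numbers below are illustrative.

```python
import math

def power(effect, n_per_group, sd=1.0):
    # Normal-approximation power of a two-sided two-sample z-test at alpha = 0.05.
    z_alpha = 1.96
    se = sd * math.sqrt(2 / n_per_group)
    z = effect / se
    # P(reject) is approximately Phi(z - z_alpha) for effects in the assumed
    # direction (the tiny contribution of the opposite tail is ignored).
    return 0.5 * (1 + math.erf((z - z_alpha) / math.sqrt(2)))

# Slightly different assumed effect sizes give very different "the" power:
for effect in (0.4, 0.5, 0.6):
    print(f"assumed effect = {effect}: power = {power(effect, 64):.2f}")
```

With 64 subjects per group, shifting the assumed effect size from 0.4 to 0.6 standard deviations moves the power from roughly 0.6 to over 0.9, so quoting a single number without its assumptions is close to meaningless.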


Subjects
Statistical Distributions, Statistical Data Interpretation, Statistical Models, Indicators (Statistics), Risk Assessment/methods, Risk Assessment/statistics & numerical data, Statistics as Topic